Building Affective Lexicons from Specific Corpora for Automatic Sentiment Analysis
نویسنده
چکیده
Automatic sentiment analysis in texts has attracted considerable attention in recent years. Most of the approaches developed to classify texts or sentences as positive or negative rest on a very specific kind of language resource: emotional lexicons. To build these resources, several automatic techniques have been proposed. Some of them are based on dictionaries while others use corpora. One of the main advantages of the corpora techniques is that they can build lexicons that are tailored for a specific application simply by using a specific corpus. Currently, only anecdotal observations and data from other areas of language processing plead in favour of the utility of specific corpora. This research aims to test this hypothesis. An experiment based on 702 sentences evaluated by judges shows that automatic techniques developed for estimating the valence from relatively small corpora are more efficient if the corpora used contain texts similar to the one that must be evaluated.
منابع مشابه
On the Automatic Learning of Sentiment Lexicons
This paper describes a simple and principled approach to automatically construct sentiment lexicons using distant supervision. We induce the sentiment association scores for the lexicon items from a model trained on a weakly supervised corpora. Our empirical findings show that features extracted from such a machine-learned lexicon outperform models using manual or other automatically constructe...
متن کاملTwitter as a Comparable Corpus to build Multilingual Affective Lexicons
Résumé The main issue of any lexicon-based sentiment analysis system is the lack of affective lexicons. Such lexicons contain lists of words annotated with their affective classes. There exist some number of such resources but only for few languages and often for a small number of affective classes, generally restricted to two classes (positive and negative). In this paper we propose to use Twi...
متن کاملInducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora
A word's sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words. We show that our approach achieves state-of-the-art ...
متن کاملAre Manually Prepared Affective Lexicons Really Useful for Sentiment Analysis
In this paper, we investigate the effectiveness of different affective lexicons through sentiment analysis of phrases. We examine how phrases can be represented through manually prepared lexicons, extended lexicons using computational methods, or word embedding. Comparative studies clearly show that word embedding using unsupervised distributional method outperforms manually prepared lexicons n...
متن کاملHow Translation Alters Sentiment
Sentiment analysis research has predominantly been on English texts. Thus there exist many sentiment resources for English, but less so for other languages. Approaches to improve sentiment analysis in a resource-poor focus language include: (a) translate the focus language text into a resource-rich language such as English, and apply a powerful English sentiment analysis system on the text, and...
متن کامل